Set topic parameters by diegomrsantos · Pull Request #406 · sigp/anchor

diegomrsantos · 2025-07-03T22:27:40Z

Issue Addressed

Proposed Changes

This PR replaces the old subnet_tracker module with a new subnet_service that tracks subnets and calculates dynamic message rates on epoch boundaries. It updates downstream consumers (network, client, scoring) to use the new API and pre-calculated rates.

Introduces subnet_service crate with dynamic message-rate computation and epoch scheduling
Updates topic-scoring API to accept pre-calculated rates (new_with_rate, topic_score_params_for_subnet_with_rate)
Refactors all crates (network, client, message_sender, fuzz, peer_manager, handshake, discovery) to depend on subnet_service instead of subnet_tracker
creates disable_gossipsub_topic_scoring and sets it to falseby default

Additional Info

Please provide any additional information. For example, future considerations
or information useful for reviewers.

dknopik · 2025-07-04T06:56:22Z

When adding the subnet logic, we decided to separate it into the subnet_service in order to avoid using the network state in the network code. This has had the advantage of avoiding blocking on the lock in the network loop and separating the application logic from the network layer.

What do you think of moving some of the logic here into the subnet_service? Either the full calculation, introducing a UpdateMessageRate(Topic, Rate) event, or just TopicUpdated(Topic, Vec<CommitteeInfo>) which would trigger the actual calculation in the network crate.

diegomrsantos · 2025-07-04T10:52:58Z

When adding the subnet logic, we decided to separate it into the subnet_service in order to avoid using the network state in the network code. This has had the advantage of avoiding blocking on the lock in the network loop and separating the application logic from the network layer.

What do you think of moving some of the logic here into the subnet_service? Either the full calculation, introducing a UpdateMessageRate(Topic, Rate) event, or just TopicUpdated(Topic, Vec<CommitteeInfo>) which would trigger the actual calculation in the network crate.

True, the current implementation needs to be changed. How about?

pub enum SubnetEvent {
    Join(SubnetId, Vec<CommitteeInfo>),
    Leave(SubnetId),
}

dknopik · 2025-07-04T11:29:07Z

@diegomrsantos

This would require us to send Join again to pass the committee info if it changes. While this may simplify some handling, an explicit Update(SubnetId, Vec<CommitteeInfo>) might be a bit clearer. I am on the fence about which is better, so feel free to try out what fits best.

diegomrsantos · 2025-07-04T12:04:57Z

@diegomrsantos

This would require us to send Join again to pass the committee info if it changes. While this may simplify some handling, an explicit Update(SubnetId, Vec<CommitteeInfo>) might be a bit clearer. I am on the fence about which is better, so feel free to try out what fits best.

How about both events with the vec? But the second one maybe UpdateScore

dknopik · 2025-07-04T12:44:17Z

I am also unsure about where we should place the actual calculation of the rate - it kind of fits both crates IMO, depending how you look at it. It feels like an application specific calculation (just like the decision what subnets to subscribe to), and maybe should therefore also live in the subnet_service, with just the message rate being passed to network. But I am kind of torn on that as well, so I leave it up to you

diegomrsantos · 2025-07-04T12:53:05Z

I am also unsure about where we should place the actual calculation of the rate - it kind of fits both crates IMO, depending how you look at it. It feels like an application specific calculation (just like the decision what subnets to subscribe to), and maybe should therefore also live in the subnet_service, with just the message rate being passed to network. But I am kind of torn on that as well, so I leave it up to you

Good observation. I think I'll move message_rate to the subnet_service. Thanks!

…or_subnet there

dknopik · 2025-07-04T14:58:00Z

The current state does not update the params on updated committee data, right? Also, based on the the discussion above, we have to adjust the events passed to network

dknopik

Nice!

dknopik · 2025-07-04T20:51:54Z

anchor/subnet_service/src/lib.rs

+            // For the "all subnets" case, we don't have specific committee info, so pass an empty
+            // vec
+            if let Err(err) = tx.try_send(SubnetEvent::Join(subnet, Vec::new())) {
+                error!(?err, "Impossible error while subscribing to all subnets");
+            }


We can still provide information on the committee in this case.

dknopik · 2025-07-04T20:57:13Z

anchor/subnet_service/src/lib.rs

+    // `previous_subnets` tracks which subnets were joined in the last iteration.
+    let mut previous_subnets = HashSet::new();
+    // Track committee info hash for each subnet to detect changes efficiently
+    let mut previous_committee_hashes: HashMap<SubnetId, u64> = HashMap::new();


I think these can be unified by hashing subnets, not committees.

But regardless, can we not just calculate the score and compare if the old score is different from the new score?

you guys probably know better than me, but I think this approach maintains cleaner boundaries: subnet service handles membership, network handles scoring. Wouldn't hashing the subnet mean we cant distinguish joining and committee changes? I dont mind either way just my 0.02.

Sorry, my comment above is incorrect, we are hashing subnets here. Still, the set and the map can be unified: After an iteration, a subnet is always either in both previous_subnets and
previous_committee_hashes, or neither. So previous_subnets does not provide value anymore, unless I am missing something.

I think this approach maintains cleaner boundaries

IMO it does not. If we have the network crate calculate the message rate, it needs access to the network state as provided by the database, which we avoided until now. This means one more thing passed into it, one more place where we may wait on a lock, and more domain specific logic (the message rate calculation algorithm) inside it. Note that by inside it I mean the code flow, not the location of the file where the message rate is being calculated.

Basically, the way I think of it is: the network crate handles scoring and subnet membership based on the parameters it is passed. It does not care about how we decide about which subnets to describe on, or how we calculate the subnet message rate, as it does not have any domain logic. We do the same approach for messages: it immediately passes them to other components to avoid any domain logic computations in the code flow of the network loop.

if you unified the hashmap and set could you distinguish for committeeupdate events for committee information changes not just when we join/leave subnet? If we can all good. Thanks for the clarification and dont disagree - the separation as it is here is what I was referring to as a preference and dont want to add state dependencies. Apologies if I oversimplified.

dknopik · 2025-07-04T20:59:42Z

anchor/subnet_service/src/lib.rs

+/// Compute a lightweight hash of committee information to detect changes efficiently
+fn compute_committee_hash(committees: &[CommitteeInfo]) -> u64 {
+    let mut hasher = std::collections::hash_map::DefaultHasher::new();
+
+    // Hash the number of committees first
+    committees.len().hash(&mut hasher);
+
+    // Hash each committee's essential data
+    for committee in committees {
+        // Hash committee members by converting to a sorted vector
+        let mut members: Vec<_> = committee.committee_members.iter().collect();
+        members.sort_unstable();
+        members.hash(&mut hasher);
+
+        // Hash validator indices
+        committee.validator_indices.hash(&mut hasher);
+    }
+
+    hasher.finish()
+}


It might be interesting to check if this is considerably faster than just calculating the score.

Recalculating is faster than computing the hashes, so I'll recalculate everything at every epoch

jking-aus

lgtm - just address daniel's comments

dknopik

Very nice! Only a few nitpicks left

dknopik · 2025-07-08T09:45:55Z

anchor/subnet_service/src/lib.rs

+            let current_state = db.borrow();
+            for subnet in (0..(subnet_count as u64)).map(SubnetId) {
+                let committees_info = get_committee_info_for_subnet(&subnet, &*current_state);
+                let message_rate = message_rate::calculate_message_rate_for_topic::<E>(
+                    &committees_info,
+                    &chain_spec,
+                );
+
+                if let Err(err) = tx.try_send(SubnetEvent::Join(subnet, message_rate)) {
+                    error!(
+                        ?err,
+                        subnet = *subnet,
+                        "Failed to send subnet join event during initialization"
+                    );
+                }
+            }


Currently, this only runs on startup - so we never update the rate when in --subscribe-all-subnets mode

anchor/subnet_service/src/lib.rs

Copilot

Pull Request Overview

Replaces the legacy subnet_tracker with a new subnet_service crate that tracks subnet membership and computes dynamic gossipsub message rates at epoch boundaries, and updates all downstream consumers to use the new API and pre-calculated rates.

Introduces subnet_service with dynamic message-rate computation and epoch scheduling
Updates network, client, scoring, and message-sender modules to use subnet_service and pre-calculated rates
Removes the old subnet_tracker crate and adjusts workspace dependencies

Reviewed Changes

Copilot reviewed 23 out of 25 changed files in this pull request and generated 1 comment.

File	Description
anchor/subnet_tracker/src/lib.rs	Deleted old `subnet_tracker` implementation
anchor/subnet_service/src/lib.rs	Added new `subnet_service` with epoch-based rate updates
anchor/network/src/network.rs	Adapted `Network` to subscribe with optional rates and handle `RateUpdate`
anchor/client/src/lib.rs	Switched from `start_subnet_tracker` to `start_subnet_service`

Comments suppressed due to low confidence (1)

anchor/subnet_service/src/lib.rs:256

[nitpick] The handle_epoch_committee_update logic emits RateUpdate events every epoch—this is critical dynamic behavior. Consider adding unit or integration tests for this function to verify that message rates are recalculated and sent correctly when an epoch boundary is reached.

async fn handle_epoch_committee_update<E: EthSpec>(

Copilot · 2025-07-08T17:15:39Z

anchor/subnet_service/src/lib.rs

+) -> Vec<CommitteeInfo> {
+    network_state
+        .clusters()
+        .values()
+        .filter(|cluster| {
+            let cluster_subnet = SubnetId::from_committee(cluster.committee_id(), SUBNET_COUNT);


In get_committee_info_for_subnet, using the constant SUBNET_COUNT can lead to inconsistencies if the service is initialized with a different subnet_count. Consider passing the runtime subnet_count into this helper instead of the fixed constant to ensure correct committee-to-subnet mapping.

Suggested change

) -> Vec<CommitteeInfo> {

network_state

.clusters()

.values()

.filter(|cluster| {

let cluster_subnet = SubnetId::from_committee(cluster.committee_id(), SUBNET_COUNT);

subnet_count: usize,

) -> Vec<CommitteeInfo> {

network_state

.clusters()

.values()

.filter(|cluster| {

let cluster_subnet = SubnetId::from_committee(cluster.committee_id(), subnet_count);

dknopik · 2025-07-08T20:02:42Z

Neat! Thanks!

One more round of testing over night, fingers crossed!

diegomrsantos · 2025-07-08T20:09:05Z

Neat! Thanks!

One more round of testing over night, fingers crossed!

The topic params are disabled in the current code

dknopik · 2025-07-09T07:21:13Z

anchor/subnet_service/src/lib.rs

+        tokio::select! {
+            // Handle database changes for subnet join/leave (only if not subscribe_all_subnets)
+            _ = db.changed(), if !subscribe_all_subnets => {
+                handle_subnet_changes::<E>(&tx, &mut db, &mut previous_subnets, subnet_count, &chain_spec, disable_gossipsub_topic_scoring).await;
+            }
+
+            // Handle scheduled epoch boundaries (for both modes, but only if scoring is enabled)
+            _ = sleep(next_epoch_delay), if !disable_gossipsub_topic_scoring => {
+                handle_epoch_committee_update::<E>(&tx, &mut db, &previous_subnets, &chain_spec).await;
+                // Recalculate the next epoch delay only after we've processed the epoch boundary
+                next_epoch_delay = calculate_duration_to_next_epoch::<E>(&slot_clock);
+            }
+        }
+    }


select panics if all branches are disabled via if, unless an else branch is provided

I suggest adding an else branch with return, as the network crate should handle this gracefully.

dknopik · 2025-07-09T09:03:59Z

anchor/network/src/config.rs

+    /// Disables gossipsub topic scoring and message rate calculations.
+    pub disable_gossipsub_topic_scoring: bool,


Do we want to add a CLI flag to re-enable this, maybe with a note that it is experimental?

…ub_topic_scoring are true

# Conflicts: # Cargo.lock # anchor/fuzz/Cargo.toml # anchor/fuzz/fuzz_targets/setup.rs

dknopik

LGTM, thank you for tackling this!

diegomrsantos · 2025-07-09T14:09:22Z

LGTM, thank you for tackling this!

Thank you for the great review

This fixes the issues arising with #406. The root cause is as follows: We send messages anonymously, setting a random author ID through `MessageAuthenticity::RandomAuthor`. Libp2p by default penalizes getting sent messages that originate from ourselves. However, as we do not track under which author we published each message, libp2p simply punishes messages that we have published, identified by message ID. The message ID depends only on the message contents. The famed "decided message" is generated after a QBFT instance completes - and as it simply is an aggregation of existing signatures and messages, it can be generated by all participants and actually ends up having identical content in many cases. Nodes generate this message at the same time and send it almost simultaneously. This causes nodes to be penalized for "self" messages, which are not actually self messages. Disable penalizing getting sent messages sent by "self" - which may actually be generated by someone else but happens to be identical.

sigp#371 This PR replaces the old `subnet_tracker` module with a new `subnet_service` that tracks subnets and calculates dynamic message rates on epoch boundaries. It updates downstream consumers (network, client, scoring) to use the new API and pre-calculated rates. - Introduces `subnet_service` crate with dynamic message-rate computation and epoch scheduling - Updates topic-scoring API to accept pre-calculated rates (`new_with_rate`, `topic_score_params_for_subnet_with_rate`) - Refactors all crates (network, client, message_sender, fuzz, peer_manager, handshake, discovery) to depend on `subnet_service` instead of `subnet_tracker` - creates `disable_gossipsub_topic_scoring` and sets it to `false`by default

This fixes the issues arising with sigp#406. The root cause is as follows: We send messages anonymously, setting a random author ID through `MessageAuthenticity::RandomAuthor`. Libp2p by default penalizes getting sent messages that originate from ourselves. However, as we do not track under which author we published each message, libp2p simply punishes messages that we have published, identified by message ID. The message ID depends only on the message contents. The famed "decided message" is generated after a QBFT instance completes - and as it simply is an aggregation of existing signatures and messages, it can be generated by all participants and actually ends up having identical content in many cases. Nodes generate this message at the same time and send it almost simultaneously. This causes nodes to be penalized for "self" messages, which are not actually self messages. Disable penalizing getting sent messages sent by "self" - which may actually be generated by someone else but happens to be identical.

sigp#371 This PR replaces the old `subnet_tracker` module with a new `subnet_service` that tracks subnets and calculates dynamic message rates on epoch boundaries. It updates downstream consumers (network, client, scoring) to use the new API and pre-calculated rates. - Introduces `subnet_service` crate with dynamic message-rate computation and epoch scheduling - Updates topic-scoring API to accept pre-calculated rates (`new_with_rate`, `topic_score_params_for_subnet_with_rate`) - Refactors all crates (network, client, message_sender, fuzz, peer_manager, handshake, discovery) to depend on `subnet_service` instead of `subnet_tracker` - creates `disable_gossipsub_topic_scoring` and sets it to `false`by default

This fixes the issues arising with sigp#406. The root cause is as follows: We send messages anonymously, setting a random author ID through `MessageAuthenticity::RandomAuthor`. Libp2p by default penalizes getting sent messages that originate from ourselves. However, as we do not track under which author we published each message, libp2p simply punishes messages that we have published, identified by message ID. The message ID depends only on the message contents. The famed "decided message" is generated after a QBFT instance completes - and as it simply is an aggregation of existing signatures and messages, it can be generated by all participants and actually ends up having identical content in many cases. Nodes generate this message at the same time and send it almost simultaneously. This causes nodes to be penalized for "self" messages, which are not actually self messages. Disable penalizing getting sent messages sent by "self" - which may actually be generated by someone else but happens to be identical.

diegomrsantos self-assigned this Jul 3, 2025

diegomrsantos added the network label Jul 3, 2025

set param for topic

7516535

diegomrsantos force-pushed the set-topic-parameters branch from 3b2f0d6 to 7516535 Compare July 4, 2025 10:19

diegomrsantos requested a review from Copilot July 4, 2025 10:20

This comment was marked as outdated.

Sign in to view

call update_topic_score_for_subnet

4f2e06a

move message_rate.rs to subnet_tracker

819f2de

diegomrsantos marked this pull request as ready for review July 4, 2025 13:33

diegomrsantos requested review from Copilot and dknopik July 4, 2025 13:33

This comment was marked as outdated.

Sign in to view

diegomrsantos force-pushed the set-topic-parameters branch from 6135908 to 222eaea Compare July 4, 2025 14:36

rename subnet_tracker to subnet_service and move get_committee_info_f…

21e561b

…or_subnet there

diegomrsantos requested a review from Copilot July 4, 2025 14:41

This comment was marked as outdated.

Sign in to view

diegomrsantos force-pushed the set-topic-parameters branch from 222eaea to 21e561b Compare July 4, 2025 16:32

diegomrsantos added 2 commits July 4, 2025 18:35

update topic score on committee change

9a12aa9

store a hash instead of a vec

c11a8fd

dknopik reviewed Jul 4, 2025

View reviewed changes

dknopik requested a review from jking-aus July 4, 2025 21:00

dknopik added the v0.2.0 Second testnet release label Jul 4, 2025

jking-aus previously approved these changes Jul 7, 2025

View reviewed changes

diegomrsantos requested a review from jking-aus July 8, 2025 09:09

dknopik reviewed Jul 8, 2025

View reviewed changes

anchor/subnet_service/src/lib.rs Outdated Show resolved Hide resolved

diegomrsantos added 4 commits July 8, 2025 15:27

simplify calculate_duration_to_next_epoch

6643750

fix next_epoch_delay calculation

58dda89

update msg rate for --subscribe-all-subnets case

ea59da0

add an option to disable topic scoring

b2a7638

diegomrsantos requested a review from Copilot July 8, 2025 17:14

Copilot AI reviewed Jul 8, 2025

View reviewed changes

simplify message rate calculation

f343fbe

diegomrsantos requested a review from dknopik July 8, 2025 17:52

dknopik reviewed Jul 9, 2025

View reviewed changes

diegomrsantos added 2 commits July 9, 2025 13:24

return from subnet_service when subscribe_all_subnets disable_gossips…

9e67842

…ub_topic_scoring are true

Merge branch 'unstable' into set-topic-parameters

05c3f53

# Conflicts: # Cargo.lock # anchor/fuzz/Cargo.toml # anchor/fuzz/fuzz_targets/setup.rs

diegomrsantos force-pushed the set-topic-parameters branch from 92a4596 to 5bd856a Compare July 9, 2025 13:03

dknopik force-pushed the set-topic-parameters branch from 48c8ca4 to 5bd856a Compare July 9, 2025 13:19

add hidden param for disable_gossipsub_topic_scoring

d193578

diegomrsantos force-pushed the set-topic-parameters branch from 5bd856a to d193578 Compare July 9, 2025 13:31

dknopik approved these changes Jul 9, 2025

View reviewed changes

dknopik added the ready-for-merge label Jul 9, 2025

mergify bot merged commit 4df75e0 into sigp:unstable Jul 9, 2025
14 checks passed

		/// Disables gossipsub topic scoring and message rate calculations.
		pub disable_gossipsub_topic_scoring: bool,

Comments

Conversation

diegomrsantos commented Jul 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issue Addressed

Proposed Changes

Additional Info

Uh oh!

dknopik commented Jul 4, 2025

Uh oh!

This comment was marked as outdated.

Uh oh!

diegomrsantos commented Jul 4, 2025

Uh oh!

dknopik commented Jul 4, 2025

Uh oh!

diegomrsantos commented Jul 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dknopik commented Jul 4, 2025

Uh oh!

diegomrsantos commented Jul 4, 2025

Uh oh!

This comment was marked as outdated.

Uh oh!

This comment was marked as outdated.

Uh oh!

dknopik commented Jul 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dknopik left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jking-aus left a comment

Choose a reason for hiding this comment

Uh oh!

dknopik left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull Request Overview

Reviewed Changes

Uh oh!

Copilot AI Jul 8, 2025

Choose a reason for hiding this comment

Uh oh!

dknopik commented Jul 8, 2025

Uh oh!

diegomrsantos commented Jul 8, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

diegomrsantos commented Jul 3, 2025 •

edited

Loading

diegomrsantos commented Jul 4, 2025 •

edited

Loading

dknopik commented Jul 4, 2025 •

edited

Loading